Flagging Inland Data - Explore Rolling SD
Summary
CMAR has collected data from several inland freshwater bodies in Nova Scotia, including lakes and rivers.
CMAR intends to process and publish all inland data under a new “Inland” branch of the Coastal Monitoring Program. Data will be processed in a similar manner to the coastal water quality data, and data flags will be applied using the qaqcmar package.
Sensors on some rivers are suspected to have been out of the water for part of their deployment because of low water levels. Data recorded while sensors were likely exposed will be flagged. Temperatures recorded while a sensor is exposed to air fluctuate more quickly than those recorded while it is submerged, which is why the rolling standard deviation is explored here as a flagging test.
The purpose of this document is to help CMAR determine appropriate data flagging tests and thresholds for freshwater (inland) data. We do not currently have enough freshwater data to conduct an analysis as thorough as the one used to develop tests and thresholds for the coastal water quality data, so thresholds may be chosen more subjectively. Note that this initial threshold analysis was completed on a subset of the data.
Waterbodies included in threshold analysis:
[1] "Gold River" "LaHave River" "Musquodoboit River"
[4] "Roseway River" "Round Hill River" "Salmon River"
[7] "Tusket River" "Liscomb River" "Mersey River"
Stations included in threshold analysis:
[1] "Gold River 2" "LaHave River 1" "LaHave River 3"
[4] "Musquodoboit River 1" "Musquodoboit River 2" "Musquodoboit River 3"
[7] "Roseway River 1" "Roseway River 2" "Round Hill River 1"
[10] "Round Hill River 2" "Round Hill River 3" "Salmon River 1"
[13] "Salmon River 2" "Tusket River 1" "Tusket River 2"
[16] "LaHave River 2" "Liscomb River 1" "Liscomb River 2"
[19] "Mersey River 2" "Tusket River 3"
Stations which may have experienced air exposure:
- Liscomb 1
- Liscomb 2
- LaHave 2
- Mersey 2
- Tusket 3
- Possibly Musquodoboit 1 and 2
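The air-exposure signature described in the summary (faster temperature fluctuation when a sensor is out of the water) can be illustrated with a rolling standard deviation. The following is a minimal Python sketch, not the qaqcmar implementation: the function name, the 5-observation centered window, and the temperature values are all assumptions made for illustration.

```python
from statistics import stdev

def rolling_sd(values, window=5):
    """Centered rolling sample standard deviation.

    `window` should be odd. Positions without a full window on both
    sides are returned as None.
    """
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = i - half, i + half + 1
        if lo < 0 or hi > len(values):
            out.append(None)  # not enough neighbours to fill the window
        else:
            out.append(stdev(values[lo:hi]))
    return out

# Hypothetical temperature series (degrees C), not CMAR data: the first
# seven readings mimic a submerged sensor, the last seven an exposed one.
submerged = [12.0, 12.1, 12.1, 12.2, 12.2, 12.3, 12.3]
exposed = [14.0, 19.5, 11.0, 22.0, 9.5, 20.0, 12.5]
temperature = submerged + exposed

sd_roll = rolling_sd(temperature, window=5)
for temp, sd in zip(temperature, sd_roll):
    print(temp, "NA" if sd is None else round(sd, 2))
```

In this made-up series the rolling standard deviation stays near zero through the submerged readings and rises sharply once the exposed readings enter the window, which is the contrast a rolling SD threshold is meant to pick up.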
Data visualization
Station locations
Approximate location of stations included in threshold analysis.
Plot uncleaned station data
Plot cleaned station data
Suspected outliers have been removed from the following datasets:
- Liscomb 1
- Liscomb 2
- LaHave 2
- Mersey 2
- Tusket 3
The cleaned datasets will be used to generate the rolling standard deviation thresholds.
Statistical overview
Distribution of sd_roll
Distribution all
Calculate rolling standard deviation thresholds
Compare various methods for calculating thresholds.
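As a rough illustration of how candidate thresholds from different methods might be compared, the Python sketch below computes a mean-plus-standard-deviation threshold and several pooled quantile thresholds. It is a sketch under stated assumptions, not the code used for this analysis: the station names and sd_roll values are invented, the helper functions are hypothetical, and interpreting "Mean_sd" as the mean plus three standard deviations is an assumption.

```python
from statistics import mean, stdev, quantiles

# Hypothetical sd_roll values for two made-up stations (not CMAR data).
sd_roll_by_station = {
    "Station A": [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 0.8, 2.5],
    "Station B": [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.6, 1.0, 3.0],
}

def mean_sd_threshold(values, k=3):
    """Mean plus k sample standard deviations (k=3 is an assumption)."""
    return mean(values) + k * stdev(values)

def quantile_threshold(values, prob):
    """Quantile at `prob`, resolved to the nearest 0.001."""
    cuts = quantiles(values, n=1000, method="inclusive")  # 999 cut points
    return cuts[round(prob * 1000) - 1]

# Pool the observations from every station into one sample.
pooled = [v for values in sd_roll_by_station.values() for v in values]

mean_sd = mean_sd_threshold(pooled)
pooled_quantiles = {
    p: quantile_threshold(pooled, p) for p in (0.95, 0.97, 0.99, 0.997)
}

print("mean_sd:", round(mean_sd, 2))
for prob, value in pooled_quantiles.items():
    print(f"pooled q{prob}:", round(value, 2))
```

A mean-based threshold is pulled upward by a few large values when the distribution is right-skewed, whereas a quantile threshold depends only on the rank of the observations, which is one reason the two approaches can give noticeably different cut-offs.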
Visualize flagged data
Visualize the data flagged by each method to determine which method produces the best results. Here the thresholds have been applied to all of the inland datasets, not just the cleaned ones used to generate the thresholds.
Mean_sd
Quartile
Quartile pooled 0.95
Quartile pooled 0.97
Quartile pooled 0.99
Quartile pooled 0.997
Apply rolling SD threshold
Visualize flagged data - all datasets
Because the sd_roll distributions are right-skewed, the quantile method was used to establish thresholds. Because the overall distribution of the data was relatively similar across stations, the data were pooled to determine a single rolling standard deviation threshold for flagging all inland datasets.
The final threshold chosen was the 99.7th percentile (q99.7) of the pooled sd_roll values: 2.04
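Applying a single threshold to a rolling standard deviation series might look like the Python sketch below. This is an illustration only: the function name and flag labels are placeholders rather than those used by qaqcmar, the sd_roll values are invented, and treating a value exactly equal to the threshold as passing is an assumption.

```python
ROLLING_SD_THRESHOLD = 2.04  # q99.7 threshold reported in this document

def flag_rolling_sd(sd_roll, threshold=ROLLING_SD_THRESHOLD):
    """Label each rolling SD value against a single threshold.

    Labels are illustrative placeholders, not qaqcmar flag names.
    """
    flags = []
    for sd in sd_roll:
        if sd is None:
            flags.append("not_evaluated")  # rolling SD could not be computed
        elif sd > threshold:               # strict inequality is an assumption
            flags.append("flagged")
        else:
            flags.append("pass")
    return flags

# Hypothetical rolling SD values (not CMAR data).
sd_roll = [None, 0.08, 0.35, 2.04, 2.05, 5.62, None]
flags = flag_rolling_sd(sd_roll)

for sd, flag in zip(sd_roll, flags):
    print(sd, flag)
```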